John Doe remarked in #AP1432 that there may be too much code in our application that isn't used at all. Before migrating the application to the new platform, we have to analyze which parts of the system are still in use and which are not.
To find out how much code is unused, we recorded the code executed in production with the coverage tool JaCoCo. The measurement ran from 21st Oct 2017 to 27th Oct 2017. The results were exported to a CSV file with the JaCoCo command line tool using the following command:
java -jar jacococli.jar report "C:\Temp\jacoco.exec" --classfiles C:\dev\repos\buschmais-spring-petclinic\target\classes --csv jacoco.csv
The CSV file contains, for each class, the number of lines that were executed (covered) and not executed (missed) during the measurement period. We keep only the relevant columns and add a LINES column with the total number of lines per class, so that we can calculate the ratio of covered lines to all lines later on.
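To illustrate the calculation, here is a minimal sketch on hypothetical data. The package names, class names and counts are invented for illustration; only the column names follow the JaCoCo CSV format. The RATIO column is the share of lines that were executed at least once.

```python
import pandas as pd

# Hypothetical rows in the shape of a JaCoCo CSV report (all values are made up).
sample = pd.DataFrame({
    "PACKAGE": ["org.example.web", "org.example.web", "org.example.util"],
    "CLASS": ["OwnerController", "VetController", "EntityUtils"],
    "LINE_MISSED": [10, 4, 8],
    "LINE_COVERED": [30, 6, 0],
})

# Total lines per class = executed lines + never-executed lines.
sample["LINES"] = sample["LINE_COVERED"] + sample["LINE_MISSED"]

# Share of lines per class that were executed during the measurement.
sample["RATIO"] = sample["LINE_COVERED"] / sample["LINES"]

print(sample[["CLASS", "LINES", "RATIO"]])
```

A ratio of 0 means that no line of the class was executed in the measured period, while a ratio of 1 means that every line was executed at least once.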
In [1]:
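import pandas as pd

# Load the JaCoCo report and keep only the columns needed for line coverage.
coverage = pd.read_csv("datasets/jacoco.csv")
# .copy() avoids a SettingWithCopyWarning when the LINES column is added below.
coverage = coverage[['PACKAGE', 'CLASS', 'LINE_COVERED', 'LINE_MISSED']].copy()
# Total number of lines per class: executed plus never-executed lines.
coverage['LINES'] = coverage.LINE_COVERED + coverage.LINE_MISSED
coverage.head(1)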
Out[1]:
In [2]:
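# Sum the line counts per package; numeric_only=True leaves out the CLASS names.
grouped_by_packages = coverage.groupby("PACKAGE").sum(numeric_only=True)
# Share of lines per package that were executed during the measurement.
grouped_by_packages['RATIO'] = grouped_by_packages.LINE_COVERED / grouped_by_packages.LINES
grouped_by_packages = grouped_by_packages.sort_values(by='RATIO')
grouped_by_packages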
Out[2]:
We plot the coverage ratio per package to get a quick overview of the result.
In [3]:
%matplotlib inline
grouped_by_packages[['RATIO']].plot(kind="barh", figsize=(8,2))
Out[3]:
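As a possible next step, the per-class data could be used to list the classes that were never executed at all. The following is only a sketch under assumptions: the helper name `never_executed` and the sample rows are hypothetical, and a class without covered lines in one week of measurement is a candidate for closer inspection, not proven dead code.

```python
import pandas as pd


def never_executed(coverage: pd.DataFrame) -> pd.DataFrame:
    """Return the classes without a single covered line, largest first."""
    unused = coverage[coverage["LINE_COVERED"] == 0]
    return unused.sort_values(by="LINE_MISSED", ascending=False)


# Hypothetical sample in the shape of the coverage DataFrame (values are made up).
sample = pd.DataFrame({
    "PACKAGE": ["org.example.web", "org.example.util", "org.example.model"],
    "CLASS": ["OwnerController", "EntityUtils", "LegacyReport"],
    "LINE_COVERED": [30, 0, 0],
    "LINE_MISSED": [10, 8, 25],
})

print(never_executed(sample)[["PACKAGE", "CLASS", "LINE_MISSED"]])
```

With the real data, the call would be `never_executed(coverage)`, assuming the `coverage` DataFrame from the first cell is still in scope.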